General Disclaimer 


One or more of the Following Statements may affect this Document 


• This document has been reproduced from the best copy furnished by the 
organizational source. It is being released in the interest of making available as 
much information as possible. 


• This document may contain data, which exceeds the sheet parameters. It was 
furnished in this condition by the organizational source and is the best copy 
available. 


• This document may contain tone-on-tone or color graphs, charts and/or pictures, 
which have been reproduced in black and white. 


• This document is paginated as submitted by the original source. 


• Portions of this document are not fully legible due to the historical nature of some 
of the material. However, it is the best reproduction available from the original 
submission. 


Produced by the NASA Center for Aerospace Information (CASI) 



JSC-14746 


Lockheed Electronics Company, Inc.
A Subsidiary of Lockheed Corporation
1830 NASA Road 1, Houston, Texas 77058
Tel. 713-333-5411

NASA CR-

"Made available under NASA sponsorship in the interest of early and wide
dissemination of Earth Resources Survey Program information and without
liability for any use made thereof."


TECHNICAL MEMORANDUM 


Ref: 642-7325 

Job Order 73-705-03 
NAS 9-15800 


FURTHER EVALUATION OF PROCEDURE 1 SECONDARY ERROR ANALYSIS 


By 

K. A. Havens 


Approved By: 


T. C. Minter, Supervisor
Techniques Development Section 


(E79-10225) LARGE AREA CROP INVENTORY EXPERIMENT (LACIE). FURTHER EVALUATION
OF PROCEDURE 1 SECONDARY ERROR ANALYSIS (Lockheed Electronics Co.) 43 p
HC A03/MF A01 CSCL 05B N79-25639 Unclas G3/43 00225


May 1979 


LEC-13180 



CONTENTS 


Section Page 

1. INTRODUCTION 1

2. COMPARISON OF RESULTS TO IBM P1 STUDY 3

3. RE-EVALUATION OF SECONDARY ERROR ANALYSIS 7

4. COMPUTATION OF THE VARIANCE OF THE ESTIMATE AND THE REDUCTION COEFFICIENT 27

5. CONCLUSIONS 35

6. REFERENCES 37



TABLES

Table Page

1 IBM RESULTS OF AVERAGE DIFFERENCES BETWEEN ESTIMATED AND GT WHEAT PROPORTIONS 4

2 SECONDARY ERROR ANALYSIS - AVERAGE DIFFERENCES BETWEEN ESTIMATED AND GT SMALL-GRAIN PROPORTIONS 4

3 ANALYSIS OF VARIANCE - AVERAGE DIFFERENCES 6

4 NUMBER OF DOTS LABELED FOR EACH TREATMENT 8

5 PCG1 DATA 9

6 TRANSFORMED PCG1 DATA 10

7 ANALYSIS OF VARIANCE FOR PCG1 12

8 STATE-BY-TREATMENT MEANS FOR PCG1 12

9 NEWMAN-KEULS TEST OF PCG1 13

10 PCG2 DATA 14

11 TRANSFORMED PCG2 DATA 15

12 ANALYSIS OF VARIANCE FOR PCG2 16

13 STATE-BY-TREATMENT MEANS FOR PCG2 16

14 NEWMAN-KEULS TEST OF PCG2 17

15 SMALL-GRAIN PROPORTION ESTIMATES 18

16 TRANSFORMED WINTER GRAIN PROPORTION ESTIMATES 20

17 TRANSFORMED SPRING GRAIN PROPORTION ESTIMATES 21

18 DIFFERENCES BETWEEN TRANSFORMED WINTER GRAIN PROPORTION ESTIMATES AND GT 22

19 DIFFERENCES BETWEEN TRANSFORMED SPRING GRAIN PROPORTION ESTIMATES AND GT 23

20 ANALYSIS OF VARIANCE FOR WINTER GRAIN PROPORTION DIFFERENCES 24

21 ANALYSIS OF VARIANCE FOR SPRING GRAIN PROPORTION DIFFERENCES 24

22 STATE-BY-TREATMENT MEANS FOR WINTER GRAIN SEGMENTS 25

23 STATE-BY-TREATMENT MEANS FOR SPRING GRAIN SEGMENTS 25

24 GT-LABELED MACHINE CLASSIFICATION 31

25 AI-LABELED MACHINE CLASSIFICATION 32





FIGURES 

Figure Page 

1 GT-labeled machine classification 33 

2 AI-labeled machine classification 34


1. INTRODUCTION 


At the close of the Phase III crop year of the Large Area Crop Inventory
Experiment (LACIE), several investigations were outlined in support of the
Classification and Mensuration Subsystem. The goal of the secondary error
analysis plan was to evaluate as many of the error sources as possible in
the Procedure 1 (P1) small-grains estimate for 5- by 6-nautical-mile segments
in the U.S. Great Plains. An evaluation of analyst labeling errors on type 1
and 2 dots was completed on a total of 25 test segments.¹ However, because
of the scope of the test, more analyses were required to provide an
understanding of the results. The further evaluation of P1 error analysis was
defined to include the following studies:

1. A comparison of the classification results obtained in the P1 secondary
error analysis study to the P1 study by the International Business
Machines Company (IBM).²

2. A re-evaluation of the classification results based on three criteria:
(a) winter- vs. spring-grain segments, (b) computation of the probability
of correct classification (PCC) for small grains only, and (c) the use
of a signed difference between the proportion estimates and ground-truth
(GT) estimates.

3. The computation of the variance of the estimate and the corresponding
reduction coefficient.

These three evaluations are considered in this document, and the results for
each evaluation are presented.

¹The test segments are the following: 1005, 1032, 1033, 1853, 1861 (Kansas);
1512, 1520 (Minnesota); 1544, 1739 (Montana); 1582 (Nebraska); 1604, 1606,
1648, 1661, 1902 (North Dakota); 1231, 1242, 1367 (Oklahoma); 1677, 1690,
1803, 1805 (South Dakota); and 1056, 1059, 1060 (Texas).

²IBM memorandum from S. G. Wheeler to R. P. Heydorn, dated June 20, 1977,
re "Procedure 1 Evaluation Experiment with Ground-Truth Labeling."


2. COMPARISON OF RESULTS TO IBM P1 STUDY


The IBM study, performed before the delivery of LACIE software (version 6)
on the Earth Resources Interactive Processing System, was aimed at determining
the best set of parameters for use in P1. Several parameter sets and cluster
labeling procedures were tested, including the nearest-neighbor cluster
parameter set and the cluster labeling technique now used in P1. Because of the
wide scope of the IBM study, only one table of results was directly comparable
to that of the secondary error analysis. Table 1 presents the IBM results
of average differences between the estimated and the GT wheat proportions.

The IBM study used GT-labeled picture elements (pixels) from field centers
and nearest-neighbor cluster parameters to classify seven test segments.
The seven test segments used were 1033, 1561, 1988, 1865, 1178, 1046, and 1978.
These seven segments were reclassified using varying numbers of channels: 8,
12, and 16. The average differences found in table 1 indicate the amount of
bias introduced by the P1 classifier trained using GT-labeled samples. The
pixels were allocated by two methods, random and stratified, both from a dot
grid laid over the image. These results were compared to the average
differences between the estimated and the GT proportions for the two treatments
(table 2) that used GT labeling in the secondary error analysis study; i.e.,
the random dot-grid and the uniform dot-grid treatments.

To test for method differences between the IBM study and the secondary error
analysis study, an analysis of variance was performed on the data presented
in tables 1 and 2 using a split plot design of the following form:

\[
y_{\text{ave difference}} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\beta\gamma)_{jk} + \epsilon_{ijk}
\]

where

$\alpha_i$, $i = 1, 2, 3$, represents channels 8, 12, and 16.

$\beta_j$, $j = 1, 2$, represents methods IBM and secondary error analysis.

$\gamma_k$, $k = 1, 2$, represents random (R) and uniform (U) treatments.

$\mu$ = overall mean.



TABLE 1.- IBM RESULTS OF AVERAGE DIFFERENCES BETWEEN
ESTIMATED AND GT WHEAT PROPORTIONS

                                     Treatments
No. of      No. of      Random dot allocation   Stratified dot allocation
channels    segments    from a grid             from a grid

 8          7           -0.187                  -0.012
12          7            2.032                   1.535
16          7           -0.497                  -1.064

Average difference       0.449                   0.153


TABLE 2.- SECONDARY ERROR ANALYSIS - AVERAGE
DIFFERENCES BETWEEN ESTIMATED AND GT
SMALL-GRAIN PROPORTIONS

                              Treatments
No. of      No. of
channels    segments    Random      Uniform

 8           9          -1.489      -2.044
12           6          -1.050       1.450
16          12          -3.508       1.658

Average difference      -2.016       0.355
























Results presented in table 3 showed no significant differences for any of the 
effects. However, due to the limited amount of data, the power of this 
analysis of variance is low. 
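
The split plot design tests whole-plot effects (channels, methods) against error (a) and subplot effects (treatment, treatment by methods) against error (b). The following Python sketch illustrates that arithmetic with the sums of squares listed in table 3. The degrees of freedom are an assumption, inferred here from the ratio of each sum of squares to its mean square; the names `rows`, `mean_square`, and `f_ratio` are illustrative only.

```python
# Sums of squares from table 3. Degrees of freedom are inferred (assumption)
# from sum of squares / mean square as tabulated.
rows = {
    "channels": (9.48438, 2),
    "methods": (3.84201, 1),
    "error_a": (1.46523, 2),
    "treatment": (3.22611, 1),
    "treatment_by_methods": (5.33333, 1),
    "error_b": (8.36291, 4),
}


def mean_square(name):
    """Mean square = sum of squares / degrees of freedom."""
    ss, df = rows[name]
    return ss / df


def f_ratio(effect, error):
    """F-ratio of an effect against the error term for its stratum."""
    return mean_square(effect) / mean_square(error)


# Whole-plot effects use error (a); subplot effects use error (b).
whole_plot = {e: f_ratio(e, "error_a") for e in ("channels", "methods")}
sub_plot = {e: f_ratio(e, "error_b") for e in ("treatment", "treatment_by_methods")}

for name, value in {**whole_plot, **sub_plot}.items():
    print(f"{name}: F = {value:.2f}")
```

Computed this way, the ratios for methods, treatment, and treatment by methods agree with the tabulated 5.24, 1.54, and 2.55. The channels ratio comes out near 6.47 rather than the tabulated 7.16, which may reflect a reproduction error in one of the table entries.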



TABLE 3.- ANALYSIS OF VARIANCE - AVERAGE DIFFERENCES

Source of               Degrees of   Sum of      Mean       F-value†
variation               freedom      squares     squares

Channels                 2            9.48438    4.74219    7.16
Methods                  1            3.84201    3.84201    5.24
Error (a)*               2            1.46523     .73261
Treatment                1            3.22611    3.22611    1.54
Treatment by methods     1            5.33333    5.33333    2.55
Error (b)**              4            8.36291    2.09073
Total                   11           31.71397    2.88309

†No significant differences between treatments at the
5-percent level were noted.

*Coefficient of variation for error (a) = 323 percent.
**Coefficient of variation for error (b) = 546 percent.


6 













3. RE-EVALUATION OF SECONDARY ERROR ANALYSIS 


The secondary error analysis experiment (reported on in ref. 1) presented the
PCC calculation for type 1 and 2 dots, denoted PCC1 and PCC2, respectively,
for three different treatments: GT labeling of a random dot grid (R), GT
labeling of a uniform dot grid (U), and analyst-interpreter (AI) labeling of
a random dot grid. The proportion estimate for each classification was
compared to the GT estimate by computation of the absolute value of the
difference, Δ_i, where i varied over treatments. The data set consisted of
25 Phase III blind sites in the U.S. Great Plains. (A complete description
of the data set and the experiment is given in ref. 1.) The re-evaluation of
this experiment required that the PCC be computed for small grains for type 1
and 2 dots; i.e., PCG1 and PCG2, respectively. After the segments were
divided into winter- and spring-grain segments,³ the proportion differences
were calculated using the signed difference between the GT and treatment
estimates (Δ'_i) to show the amount and direction of bias. Each of these
response variables, PCG1, PCG2, and Δ'_i, underwent an analysis-of-variance
test. A Newman-Keuls multiple comparison test (ref. 3) was planned in the
event that any of these analysis-of-variance tests indicated significant
treatment differences.

Table 4 presents the number of dots labeled for each treatment. Table 5
presents the PCG1 values for each of the three treatments. Three missing
PCG1 values for the AI treatment were estimated from the block means by using
the method of least squares. Since these data are presented in terms of
percentages ranging in value from 0 to 100, they were transformed using
the arc sine of the square root of the raw data (arc sin √PCG1). The
transformed PCG1 data are presented in table 6. The results of the analysis

³Since the Phase III GT did not identify barley, rye, and oats as winter or
spring grains, the GT proportions were established by econometric models
of the categories (ref. 2).


7 



TABLE 4.- NUMBER OF DOTS LABELED FOR EACH TREATMENT

                       AI- and GT-labeled random grids   GT-labeled uniform grids
Segment   State        No. of type 1   No. of type 2     No. of type 1   No. of type 2
                       dots            dots              dots            dots

1005      Colo.        39              60                43              60
1032      Kans.        39              60                49              60
1033      Kans.        50              60                50              60
1853      Kans.        31              60                44              60
1861      Kans.        45              60                47              60
1512      Minn.        46              60                47              59
1520      Minn.        31              60                41              60
1544      Mont.        35              60                45              60
1739      Mont.        43              60                43              60
1582      Neb.         45              60                47              59
1604      N. Dak.      41              60                42              60
1606      N. Dak.      --              47                30              40
1648      N. Dak.      --              60                42              58
1661      N. Dak.      --              54                36              50
1902      N. Dak.      --              60                45              59
1231      Okla.        36              60                42              59
1242      Okla.        32              50                32              53
1367      Okla.        40              60                48              59
1677      S. Dak.      42              53                45              52
1690      S. Dak.      40              60                40              60
1803      S. Dak.      49              60                47              50
1805      S. Dak.      50              99                32              40
1056      Tex.         40              60                48              60
1059      Tex.         49              60                50              60
1060      Tex.         46              60                43              60

-- = entry not legible in the source copy.
8 




















































TABLE 5.- PCG1 DATA

[Table body not legible in this reproduction.]



9 

















































TABLE 6.- TRANSFORMED PCG1 DATA

                        Treatment
Segment   State         Random     Uniform    AI

1005      Colo.         54.76      --         --
1032      Kans.         65.20      --         --
1033      Kans.         26.56      --         --
1853      Kans.         65.88      --         --
1861      Kans.         65.88      --         --
1512      Minn.         56.04      --         --
1520      Minn.         90.00      --         --
1544      Mont.         63.44      --         --
1739      Mont.         61.89      --         --
1582      Neb.          70.54      --         --
1604      N. Dak.       75.94      --         --
1606      N. Dak.       72.34      --         --
1648      N. Dak.       60.00      --         --
1661      N. Dak.       54.76      --         --
1902      N. Dak.       35.24      --         --
1231      Okla.         90.00      --         --
1242      Okla.         90.00      --         --
1367      Okla.         73.26      --         --
1677      S. Dak.       58.50      --         --
1690      S. Dak.       90.00      --         --
1803      S. Dak.       90.00      --         --
1805      S. Dak.       58.50      --         --
1056      Tex.          90.00      --         --
1059      Tex.          62.24      --         --
1060      Tex.          69.30      --         --

-- = entry not legible in the source copy.










































of variance performed on the PCG1 transformed data are given in table 7. The
state-by-treatment means of the PCG1 values were computed using the
untransformed data and excluding the segment treatments for which a missing
value was computed. These means are presented in table 8. The Newman-Keuls
multiple comparison test of the table 7 analysis of variance appears in table 9.
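
The arc sine square root (angular) transformation applied to the PCG values can be sketched in Python as follows. The sketch assumes the result is expressed in degrees, which is consistent with the transformed values in table 6 (a PCG1 of 100 percent maps to 90.00); the function names are illustrative and do not come from the memorandum.

```python
import math


def arcsine_sqrt_deg(percent):
    """Angular transform of a percentage in [0, 100], returned in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def inverse_arcsine_sqrt(angle_deg):
    """Back-transform an angle in degrees to a percentage in [0, 100]."""
    return 100.0 * math.sin(math.radians(angle_deg)) ** 2


# The Colorado random-grid state mean of 66.7 percent (table 8) maps to
# about 54.76 degrees, the value listed for segment 1005 in table 6.
print(round(arcsine_sqrt_deg(66.7), 2))
```

The transformation stretches the scale near 0 and 100 percent, where the variance of a percentage is smallest, so that the transformed data better satisfy the equal-variance assumption of the analysis of variance.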

In the analysis of variance, an α-level of 0.05 was used to perform the
F-tests. A significant difference was found in treatment effects. The
Newman-Keuls test showed the AI-labeled treatment to be significantly different
from either of the GT-labeled treatments. However, the GT-labeled treatments
were not significantly different from each other.
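
The Newman-Keuls comparison can be sketched in Python as shown below. The sketch assumes that each critical range is W_p = q_p × sqrt(MSE / n), where q_p is the studentized range constant for p ordered means; with n = 25 segments per treatment (an inference, not stated in the memorandum) this reproduces the factors 2.65 and 3.05 quoted under tables 9 and 14. The q constants 2.89 and 3.49 are taken from those table footnotes, the test is applied to the treatment means as printed, and the function name `newman_keuls` is illustrative.

```python
import math


def newman_keuls(means, mse, n, q_table):
    """Step-down Newman-Keuls test.

    means   : dict mapping treatment name to mean
    mse     : error mean square from the analysis of variance
    n       : observations per treatment mean (assumed equal)
    q_table : dict mapping p (number of means spanned) to the q constant
    Returns a dict mapping (lower, higher) treatment pairs to True when
    the pair is declared significantly different.
    """
    se = math.sqrt(mse / n)
    ordered = sorted(means.items(), key=lambda kv: kv[1])
    k = len(ordered)
    homogeneous = []  # index ranges already judged not significant
    result = {}
    # Test the widest ranges first; a non-significant range protects
    # every range contained in it from further testing.
    for span in range(k, 1, -1):
        for lo in range(k - span + 1):
            hi = lo + span - 1
            pair = (ordered[lo][0], ordered[hi][0])
            if any(a <= lo and hi <= b for a, b in homogeneous):
                result[pair] = False
                continue
            diff = ordered[hi][1] - ordered[lo][1]
            significant = diff > q_table[span] * se
            result[pair] = significant
            if not significant:
                homogeneous.append((lo, hi))
    return result


q = {2: 2.89, 3: 3.49}  # constants quoted under tables 9 and 14
pcg1 = newman_keuls({"AI": 60.0, "Uniform": 80.2, "Random": 80.6}, 175.60, 25, q)
pcg2 = newman_keuls({"AI": 53.0, "Uniform": 58.4, "Random": 67.9}, 233.31, 25, q)
print(pcg1)
print(pcg2)
```

Under these assumptions the sketch separates the AI treatment from both GT treatments for PCG1, and separates the random grid from the other two treatments for PCG2, in line with the groupings reported in the text.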

Table 10 presents the PCG2 values for each of the three treatments. The three
missing PCG2 values for the AI treatment were estimated in the same manner
as for the PCG1 values. Again, the data were transformed using the arc sine
of the square root of the raw data; the transformed data are given in table 11.
The results of the analysis of variance performed on the transformed data
appear in table 12. The state-by-treatment means, computed in the same manner
as for the PCG1 values, appear in table 13, and the treatment means are ranked
by the Newman-Keuls test in table 14.
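
The memorandum does not give the formula used to estimate the missing AI values. As an illustration only, the following Python sketch shows the textbook least-squares estimate of a single missing cell in a randomized block layout, x = (tT + bB - G) / ((t - 1)(b - 1)), where T, B, and G are the treatment, block, and grand totals of the observed values. Treating segments as blocks, and handling one missing cell at a time, are assumptions of the sketch; the function name is hypothetical.

```python
def estimate_missing_value(table, block, treatment):
    """Least-squares estimate of one missing cell in a blocks-by-treatments table.

    table     : list of rows (blocks), each a list of treatment values;
                the missing cell is marked with None
    block     : row index of the missing cell
    treatment : column index of the missing cell
    """
    b = len(table)
    t = len(table[0])
    block_total = sum(v for v in table[block] if v is not None)
    treatment_total = sum(
        row[treatment] for row in table if row[treatment] is not None
    )
    grand_total = sum(v for row in table for v in row if v is not None)
    return (t * treatment_total + b * block_total - grand_total) / (
        (t - 1) * (b - 1)
    )


# Purely additive toy data (block effect + treatment effect); the true
# value of the missing cell is 20 + 10 = 30.
toy = [
    [10.0, 15.0, 20.0],
    [20.0, 25.0, None],
    [30.0, 35.0, 40.0],
]
print(estimate_missing_value(toy, block=1, treatment=2))
```

With several missing cells, the same formula is usually applied iteratively, and one error degree of freedom is given up for each estimated value, which agrees with the reduced total of 71 degrees of freedom shown in tables 7 and 12.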

An α-level of 0.05 was again used to perform the F-tests, with significant
differences being found for the treatment effect and the segment-within-state
effect. The Newman-Keuls test ranked the AI-labeled treatment with the
GT-labeled uniform-dot treatment. These two treatments were significantly
different from the GT-labeled random-dot treatment.

One can conclude from these analyses that the GT labeling should improve the
probability of correctly classifying small grains. However, from the varying
results of the Newman-Keuls tests, it is unclear which dot grid is preferable
for the labeling. The random dot grid was consistently the highest ranked
according to treatment means; but for type 1 dots, it did not test as
significantly different from the uniform dot grid.


11 



TABLE 7.- ANALYSIS OF VARIANCE FOR PCG1(a)

Source of               Degrees of   Sum of       Mean      F-value
variation               freedom      squares      square

Total                   (b)71        25 469.91     358.73
States                   8            4 089.49     511.19   (c)2.91
Segment within state    16            8 363.36     522.71   (c)2.98
Treatment                2            5 159.30   2 579.65   (d)14.69
Treatment by state      16            2 765.37     172.84   (c)<1
Error(e)                29            5 092.40     175.60

(a) The analysis was based on transformed data using arc sin √PCG1.

(b) Three missing values were estimated and used in the analysis.

(c) No significant differences between means at the 5-percent
level were noted.

(d) Differences between means are significant at the 1-percent
level.

(e) Coefficient of variation for this error = 2?.5 percent
(middle digit not legible in the source copy).


TABLE 8.- STATE-BY-TREATMENT MEANS FOR PCG1

            No. of           Treatment                State
State       segments    Random   Uniform   AI(a)      average

Colo.       1           66.7     81.3      60.0       69.3
Kans.       4           67.2     75.2      50.2(b)    65.5
Minn.       2           84.4     59.3      62.5(c)    70.0
Mont.       2           78.9     71.4      66.7(d)    73.5
Neb.        1           88.9     100.0     88.9       92.6
N. Dak.     5           72.1     87.4      43.7       67.7
Okla.       3           97.2     90.7      83.5       90.5
S. Dak.     4           86.4     74.0      50.7       70.3
Tex.        3           88.6     85.7      60.6       78.3

Treatment average       80.6     80.2      60.0       73.6

(a) The AI treatment averages for each state did not
include the segments for which a missing value
was calculated.

(b) Three segments were used for Kansas.
(c) One segment was used for Minnesota.
(d) One segment was used for Montana.


12 





































TABLE 9.- NEWMAN-KEULS TEST OF PCG1(a)

Treatment     Mean

AI            60.0
Uniform       80.2 ]
Random        80.6 ]

(a) The W constants are derived as follows:

W2 = (2.89) (2.65) = 7.66
W3 = (3.49) (2.65) = 9.25

13 






TABLE 10.- PCG2 DATA 



14 







































TABLE 11.- TRANSFORMED PCG2 DATA



15 









































TABLE 12.- ANALYSIS OF VARIANCE FOR PCG2(a)

Source of               Degrees      Sum of       Mean      F-value
variation               of freedom   squares      square

Total                   (b)71        30 930.96    435.65
States                   8            6 151.91    768.99    (c)3.30
Segment within state    16           13 623.90    851.50    (d)3.65
Treatment                2            1 877.34    938.67    (d)4.02
Treatment by state      16            2 511.62    156.98    (c)<1
Error(e)                (b)29         6 766.09    233.31

(a) The analysis was based on transformed data using arc sin √PCG2.

(b) Three missing values were estimated and used in the analysis.

(c) No significant differences between means at the 5-percent
level were noted.

(d) Differences between means are significant at the 5-percent level.

(e) Coefficient of variation for this error = 29.8 percent.


TABLE 13.- STATE-BY-TREATMENT MEANS FOR PCG2

            No. of           Treatment                State
State       segments    Random   Uniform   AI(a)      average

Colo.       1           71.4     62.1      30.4       54.6
Kans.       4           58.3     48.6      50.?(b)    52.7
Minn.       2           36.2     61.3      57.1(c)    50.6
Mont.       2           80.4     42.4      61.7(d)    61.5
Neb.        1           85.7     83.3      75.0       81.3
N. Dak.     5           56.5     44.8      38.1       46.5
Okla.       3           84.5     79.2      85.8       83.2
S. Dak.     4           82.4     58.1      40.9       60.5
Tex.        3           69.6     72.6      59.0       67.1

Treatment average       67.9     58.4      53.0       57.7

(a) The AI treatment averages for each state did not include the
segments for which a missing value was calculated.

(b) Three segments were used for Kansas.

(c) One segment was used for Minnesota.

(d) One segment was used for Montana.

? = digit not legible in the source copy.

16 


































TABLE 14.- NEWMAN-KEULS TEST OF PCG2(a)

Treatment     Mean

AI            53.0 ]
Uniform       58.4 ]
Random        67.9

(a) The W constants are derived as follows:

W2 = (3.05) (2.89) = 8.8
W3 = (3.05) (3.49) = 10.6


17 





TABLE 15.- SMALL-GRAIN PROPORTION ESTIMATES

                 GT label
State            Random    Uniform    GT

Colo.            38        48         --
Kans.            37        40         38.6
Kans.             9        13          9.5
Kans.            35        35         30.3
Kans.             6        25         (a)35.3
Minn.            16        28         --
Minn.            22        22         --
Mont.            60        40         --
Mont.(b)         21        21         --
                 12        16         --
Neb.             16        14         --
N. Dak.          53        54         --
N. Dak.          25        33         --
N. Dak.          33        26         --
N. Dak.          37        35         --
N. Dak.          11         8         --
Okla.            72        74         --
Okla.            50        50         --
Okla.            62        58         --
S. Dak.          28        40         --
S. Dak.          18        26         --
S. Dak.           2         3         --
S. Dak.(b)        1         0         --
                 16        19         --
Tex.             17        26         --
Tex.             38        43         --
Tex.             20        22         --

(a) Based on a 400-dot estimate.

(b) The first estimates are for winter wheat; the second for
spring wheat.

-- = entry not legible in the source copy.

18 




























































The small-grain proportion estimates and the GT estimates are presented in
table 15. For one segment, a 400-dot count estimate was used in lieu of the
GT estimate because of incomplete GT coverage. These proportions were
transformed using the arc sine of the square root of each proportion estimate.
The transformed data for winter grain segments are listed in table 16 and for
spring grain segments in table 17. To analyze these data, the transformed GT
estimates were subtracted from the transformed proportion estimates, with
the differences denoted as Δ'R, Δ'U, and Δ'AI. In each case, a positive
Δ'-value indicates an overestimate of the GT for that particular procedure.
Tables 18 and 19 present the differences for the winter grain estimates
and the spring grain estimates, respectively. The results of the analyses of
variance performed on the difference tables appear in tables 20 and 21.
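
The signed difference of transformed proportions can be sketched in Python as follows. The sketch assumes the angular transform is taken in degrees; the function names are illustrative. The example values are the segment 1005 random and uniform estimates from table 15 (38 and 48 percent) and the GT proportion of 0.347 listed for that segment in table 24.

```python
import math


def angular(percent):
    """Arc sine square root transform of a percentage, in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def signed_difference(estimate_pct, gt_pct):
    """Transformed estimate minus transformed GT; positive means overestimate."""
    return angular(estimate_pct) - angular(gt_pct)


delta_r = signed_difference(38.0, 34.7)  # random grid, segment 1005
delta_u = signed_difference(48.0, 34.7)  # uniform grid, segment 1005
print(round(delta_r, 2), round(delta_u, 2))
```

Both differences are positive, indicating that the random and uniform GT-labeled treatments overestimated the GT proportion for this segment; they agree with the first row of table 18 (1.97 and 7.76) to within rounding.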

Using an α-level of 0.05, no significant differences were found for any of the
effects for the winter grain proportion differences. Because the winter grain
areas consist of relatively large field sizes, the estimates were expected
to be fairly close to the GT values. Thus, no statistical significances were
expected from this analysis of variance. However, for the spring grain
proportion differences, significant results were found for all effects tested:
state, segment within state, treatment, and state by treatment.

These significant results can be attributed to several problem areas that have 
been associated with spring grain estimation in previous phases of LACIE, 
such as strip fields, confusion crops, and adverse weather conditions. Tables 
of state-by-treatment means are presented for winter and spring grain segments 
in tables 22 and 23, respectively. Because the state-by-treatment interaction 
is statistically significant for spring grains, the comparisons are based on 
the state-by-treatment means. 

The least significant difference (LSD) values for comparing any two treatment 
means of the same state were computed (presented in table 23). Results indi- 
cate that for Minnesota and Montana, the uniform and AI treatments were sig- 
nificantly better than the random treatment but were not significantly dif- 
ferent from each other. For North Dakota and South Dakota, the random and 


19 



TABLE 16.- TRANSFORMED WINTER GRAIN
PROPORTION ESTIMATES

                        Labeling procedure
Segment   State         Random    Uniform    AI        GT

1005      Colo.         38.06     43.85      26.56     36.09
1032      Kans.         37.47     39.23      28.66     38.41
1033      Kans.         17.46     21.13       8.13     17.95
1853      Kans.         36.27     36.27      30.66     33.40
1861      Kans.         14.18     30.00      33.83     36.45
1739      Mont.         27.28     27.28      18.44     30.26
1582      Neb.          23.58     21.97      25.10     26.13
1231      Okla.         58.05     --         --        59.41
1242      Okla.         45.00     --         --        43.39
1367      Okla.         51.94     49.60      --        47.29
1803      S. Dak.       --         9.98       8.13      6.02
1805      S. Dak.       --         0.00      --         6.29
1056      Tex.          24.35     30.66      34.45     28.38
1059      Tex.          38.06     40.98      38.06     41.84
1060      Tex.          26.56     27.97      24.35     28.73

-- = entry not legible in the source copy.

20 




































TABLE 17.- TRANSFORMED SPRING GRAIN PROPORTION ESTIMATES

                        Labeling procedure
Segment   State         Random    Uniform    AI        GT

1512      Minn.         23.58     --         33.83     35.49
1520      Minn.         27.97     --         27.28     33.21
1544      Mont.         --        39.23      40.98     38.23
1739      Mont.         --        23.58      18.44      9.98
1604      N. Dak.       46.72     47.29      36.27     47.52
1606      N. Dak.       30.00     35.06      25.84     --
1648      N. Dak.       35.06     30.66      36.87     --
1661      N. Dak.       37.47     36.27      35.06     40.34
1902      N. Dak.       19.37     16.43      15.34     17.05
1677      S. Dak.       31.95     39.23      29.33     --
1690      S. Dak.       25.10     30.66      17.46     --
1805      S. Dak.       23.58     25.84      20.27     22.46

-- = entry not legible in the source copy.


21 































TABLE 18.- DIFFERENCES BETWEEN TRANSFORMED WINTER GRAIN
PROPORTION ESTIMATES AND GT

                        Labeling procedure
Segment   State         Δ'R       Δ'U        Δ'AI

1005      Colo.           1.97      7.76      -9.53
1032      Kans.          -0.94      0.82      -9.75
1033      Kans.          -0.49      3.18      -9.82
1853      Kans.           2.87      2.87      -2.74
1861      Kans.         -22.27     -6.45      -2.62
1739      Mont.          -2.98     -2.98     -11.82
1582      Neb.           -2.55     -4.16      -1.03
1231      Okla.          --        --          1.26
1242      Okla.          --        --          2.?8
1367      Okla.           4.65     --        -10.42
1803      S. Dak.         2.11      3.96       2.11
1805      S. Dak.        -0.55     -6.29      -6.29
1056      Tex.           --         2.28       6.07
1059      Tex.           --        -0.86      -3.78
1060      Tex.           --        -0.76      -4.38

Δ' = estimate - GT.

-- = entry not legible in the source copy; ? = digit not legible.


22 






































TABLE 19.- DIFFERENCES BETWEEN TRANSFORMED SPRING
GRAIN PROPORTION ESTIMATES AND GT

                        Labeling procedure
Segment   State         Δ'R       Δ'U        Δ'AI

1512      Minn.          --        -3.54      -1.66
1520      Minn.          --        -5.24      -5.93
1544      Mont.          --         1.00       2.75
1739      Mont.          --        13.60       8.46
1604      N. Dak.        -0.80     --        -11.25
1606      N. Dak.        -5.00     --         -9.16
1648      N. Dak.        -2.94     -7.34      -1.13
1661      N. Dak.        -2.87     -4.07      -5.28
1902      N. Dak.         2.32     -0.62      -1.71
1677      S. Dak.        -3.78      3.50      -6.40
1690      S. Dak.        -2.39      3.17     -10.03
1805      S. Dak.         1.12      3.38      -2.19

Δ' = estimate - GT.

-- = entry not legible in the source copy.


23 


























TABLE 20.- ANALYSIS OF VARIANCE FOR WINTER
GRAIN PROPORTION DIFFERENCES

                        Degrees      Sum of      Mean
Source of variation     of freedom   squares     square    F-value†

State                    6            154.07     25.68     0.86
Segment within state     8            330.76     41.35     1.39
Treatment                2            135.62     67.81     2.28
State by treatment      12            216.64     18.05     0.61
Error*                  16            475.48     29.72
Total                   44          1 312.57

†The proportion differences are not significantly different at the
5-percent level.

*Coefficient of variation for error = 2.88 percent.


TABLE 21.- ANALYSIS OF VARIANCE FOR SPRING
GRAIN PROPORTION DIFFERENCES

                        Degrees      Sum of      Mean
Source of variation     of freedom   squares     square    F-value†

State                    3            303.93    101.31     10.02
Segment within state     8            510.20     63.78      6.31
Treatment                2             99.89     49.95      4.94
State by treatment       6            196.88     32.81      3.25
Error*                  16            161.83     10.11
Total                   35          1 272.73

†Significant differences between proportion differences at the 5-percent
level were noted.

*Coefficient of variation for error = 2.72 percent.


24 





















TABLE 22.- STATE-BY-TREATMENT MEANS
FOR WINTER GRAIN SEGMENTS

            No. of           Treatment               State
State       segments    Random   Uniform   AI        average

Colo.       1             3.0     13.0     -15.0      0.3
Kans.       4            -9.5     --        -4.8     -5.8
Mont.       1            -4.0     --       -15.0     -7.6
Neb.        1            -3.0     -5.0      -1.0     -3.0
Okla.       3             3.0      2.3      -4.0      0.4
S. Dak.     2             0.5      0.5       0.0      0.3
Tex.        3            -5.3      0.0      -1.3     -2.2

Treatment average        -2.2      0.5      -5.9     -2.5

-- = entry not legible in the source copy.


TABLE 23.- STATE-BY-TREATMENT MEANS
FOR SPRING GRAIN SEGMENTS

            No. of           Treatment(a)            State      LSD values
State       segments    Random   Uniform   AI        average    At 5%   At 1%

Minn.       2           *-13.0   †-7.0     †-6.0     -8.6       5.6     9.3
Mont.       2           *15.5    †7.5      †6.0       9.7       5.6     9.3
N. Dak.     5           *-2.8    *-3.4     †-8.6     -4.9       3.5     5.9
S. Dak.     3           *-2.7    *5.0      †-8.3     -2.0       4.5     7.6

Treatment average       -0.8     0.5       -4.2      -1.5

(a) Any two treatment means superscribed by the same symbol (asterisk or
dagger) are not significantly different from each other at the 5% level.


25 

































uniform treatments were not significantly different from each other but were
both significantly better than the AI treatment. Consideration of all
spring grain states investigated indicates that the uniform treatment was
not significantly different from the best of the treatments.


26 



4. COMPUTATION OF THE VARIANCE OF THE ESTIMATE 
AND THE REDUCTION COEFFICIENT 


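In this section, the variance of the estimate and the reduction coefficient (R)
are derived.

Let $x_1, \ldots, x_N$ denote the spectral samples of type 2 dots, and let
$\theta_{ij}$ be a function of $x_j$, where

\[
\theta_{ij} =
\begin{cases}
1 & \text{if pixel } j \text{ of class } i \text{ is wheat} \\
0 & \text{if pixel } j \text{ of class } i \text{ is nonwheat}
\end{cases}
\]

Let

$N$ = total number of type 2 dots.

$N_1$ = number of type 2 dots in wheat strata.

$N - N_1$ = number of type 2 dots in nonwheat strata.

$\lambda$ = machine estimate of wheat.

The proportion estimate can be expressed as

\[
\hat{P}_N = \frac{\lambda}{N_1} \sum_{i=1}^{N_1} \theta_{1i}
          + \frac{1 - \lambda}{N - N_1} \sum_{i=1}^{N - N_1} \theta_{0i}
          = \lambda P_{11} + (1 - \lambda) P_{10}
\tag{1}
\]

where

\[
P_{11} = \Pr[\text{labeled wheat} \mid \text{classified wheat}]
       = \frac{1}{N_1} \sum_{i=1}^{N_1} \theta_{1i}
\]

\[
P_{10} = \Pr[\text{labeled wheat} \mid \text{classified nonwheat}]
       = \frac{1}{N - N_1} \sum_{i=1}^{N - N_1} \theta_{0i}
\]

The variance of the estimator is expressed as

\[
\operatorname{Var}(\hat{P}_N) = \lambda^2 \frac{P_{11}(1 - P_{11})}{N_1}
                              + (1 - \lambda)^2 \frac{P_{10}(1 - P_{10})}{N - N_1}
\tag{2}
\]

Assume $\pi$ is the probability that an analyst labels a pixel wheat and

\[
\begin{aligned}
\lambda P_{11} &= \Pr(\text{classified W}) \cdot \Pr(\text{labeled W} \mid \text{classified W}) \\
               &= \Pr(\text{labeled W, classified W}) \\
               &= \Pr(\text{labeled W}) \cdot \Pr(\text{classified W} \mid \text{labeled W}) \\
               &= \pi \pi_{11}
\end{aligned}
\]

where

$\pi_{11}$ = Pr(classified W | labeled W),

$\pi_{01}$ = Pr(classified N | labeled W),

and

$N_1 = \lambda N$, $N - N_1 = (1 - \lambda) N$.

Correspondingly, $\pi_{10}$ = Pr(classified W | labeled N) and
$\pi_{00}$ = Pr(classified N | labeled N).

Then equation (2) can be expressed as

\[
\begin{aligned}
\operatorname{Var}(\hat{P}_N)
 &= \frac{\lambda P_{11} \cdot \lambda (1 - P_{11})}{N_1}
  + \frac{(1 - \lambda) P_{10} \cdot (1 - \lambda)(1 - P_{10})}{N - N_1} \\
 &= \frac{\Pr(\text{labeled W, classified W}) \cdot \Pr(\text{labeled N, classified W})}{N_1}
  + \frac{\Pr(\text{labeled W, classified N}) \cdot \Pr(\text{labeled N, classified N})}{N - N_1} \\
 &= \frac{\pi \pi_{11} (1 - \pi) \pi_{10}}{N_1}
  + \frac{\pi \pi_{01} (1 - \pi) \pi_{00}}{N - N_1}
\end{aligned}
\]

\[
\operatorname{Var}(\hat{P}_N) = \pi (1 - \pi)
  \left[ \frac{\pi_{11} \pi_{10}}{N_1} + \frac{\pi_{01} \pi_{00}}{N - N_1} \right]
\tag{3}
\]

Using $\pi_{11} = 1 - \pi_{01}$ and $\pi_{00} = 1 - \pi_{10}$, then

\[
\begin{aligned}
\lambda &= \Pr(\text{classified W}) \\
        &= \Pr(\text{classified W, labeled W}) + \Pr(\text{classified W, labeled N}) \\
        &= \pi \pi_{11} + (1 - \pi) \pi_{10} \\
        &= \pi (1 - \pi_{01}) + (1 - \pi) \pi_{10}
\end{aligned}
\]

and

\[
\begin{aligned}
1 - \lambda &= \Pr(\text{classified N}) \\
            &= \Pr(\text{classified N, labeled W}) + \Pr(\text{classified N, labeled N}) \\
            &= \pi \pi_{01} + (1 - \pi) \pi_{00} \\
            &= \pi \pi_{01} + (1 - \pi)(1 - \pi_{10})
\end{aligned}
\]

Thus, substituting into equation (3),

\[
\operatorname{Var}(\hat{P}_N) = \frac{\pi (1 - \pi)}{N}
  \left[ \frac{(1 - \pi_{01}) \pi_{10}}{\pi (1 - \pi_{01}) + (1 - \pi) \pi_{10}}
       + \frac{\pi_{01} (1 - \pi_{10})}{\pi \pi_{01} + (1 - \pi)(1 - \pi_{10})} \right]
  = R \, \frac{\pi (1 - \pi)}{N}
\tag{4}
\]

where R is known as the reduction coefficient and the expression
$\pi (1 - \pi) / N$ is generally known as the sampling error. The expression
for R is easily computed from the omission and commission errors for the
type 2 dots and can be viewed as an indication of how much the machine
classification improves the proportion estimation.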
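
The reduction coefficient can be computed from the commission rate π10 = Pr(classified W | labeled N), the omission rate π01 = Pr(classified N | labeled W), and the wheat proportion. The following Python sketch is one way to evaluate it; the function names are illustrative, and returning NaN when the classifier assigns every dot to a single stratum is a choice of this sketch rather than something specified in the memorandum. The example inputs are the segment 1005 values from table 24.

```python
import math


def reduction_coefficient(pi10, pi01, p):
    """Reduction coefficient R.

    pi10 : Pr(classified wheat | labeled nonwheat), the commission rate
    pi01 : Pr(classified nonwheat | labeled wheat), the omission rate
    p    : proportion of dots labeled wheat
    """
    lam = p * (1.0 - pi01) + (1.0 - p) * pi10  # Pr(classified wheat)
    if lam <= 0.0 or lam >= 1.0:
        # One stratum is empty, so R is undefined (choice made in this sketch).
        return math.nan
    wheat_term = (1.0 - pi01) * pi10 / lam
    nonwheat_term = pi01 * (1.0 - pi10) / (1.0 - lam)
    return wheat_term + nonwheat_term


def estimate_variance(pi10, pi01, p, n):
    """Variance of the proportion estimate: R * p * (1 - p) / n."""
    return reduction_coefficient(pi10, pi01, p) * p * (1.0 - p) / n


# Segment 1005 (Colo.) from table 24: pi10 = 0.108, pi01 = 0.565, p = 0.347, n = 60.
r_1005 = reduction_coefficient(0.108, 0.565, 0.347)
var_1005 = estimate_variance(0.108, 0.565, 0.347, 60)
print(round(r_1005, 3), round(var_1005, 5))
```

An R-value near 1 means the machine classification adds little beyond simple random sampling of labeled dots, whereas a value well below 1 indicates a useful stratification. A perfect classifier (π10 = π01 = 0) gives R = 0.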


The R-values were computed for GT-labeled machine classifications, which
were performed for the secondary error analysis study using the random grid
system, and for AI-labeled machine classifications, which were performed for
Phase III LACIE processing. The machine classifications in both cases
were compared to GT labels. For three segments (1520, 1739, and 1861),
the Phase III processing results were unavailable for analysis. Tables 24
and 25 present the raw data and computed R-values for the GT-labeled machine
classifications and the AI-labeled machine classifications, respectively.
Figures 1 and 2, representing the computations from tables 24 and 25,
respectively, plot the GT proportion estimate (p) versus the computed R-value.
The mean reduction coefficient (R) values (from tables 24 and 25) are as
follows:

1. GT-labeled random grid - 0.718

2. AI-labeled random grid - 0.714

The standard deviations on these estimates are 0.217 and 0.132, respectively.


30 



TABLE 24.- GT-LABELED MACHINE CLASSIFICATIONS

Segment   State      π10      π01      p          n     R       p(1-p)/n   R·p(1-p)/n

1005      Colo.      0.108    0.565    0.347      60    0.859   0.00378    0.00324
1032      Kans.      0.200    0.292    0.386      59    0.744   0.00402    0.00299
1033      Kans.      .039     .667     .095       58    .881    .00148     .00131
1853      Kans.      .000     .500     .303       60    .589    .00352     .00207
1861      Kans.      .171     .500     (a).353    43    .879    .00531     .00467
1512      Minn.      --       0.714    --         59    0.999   0.00379    0.00378
1520      Minn.      --       .133     --         60    .667    .00350     .00233
1544      Mont.      --       --       0.383      60    0.874   0.00394    0.00344
(b)1739   Mont.      --       --       .284       59    .826    .00345     .00285
1582      Neb.       0.019    0.143    0.194      60    0.261   0.00261    0.00068
1604      N. Dak.    0.179    0.281    --         60    0.707   0.00416    0.00294
1606      N. Dak.    .346     .095     --         47    .723    .00470     .00340
1648      N. Dak.    .175     .650     --         60    .961    .00392     .00377
1661      N. Dak.    .094     .333     .410       53    .640    .00456     .00292
1902      N. Dak.    .057     1.00     .086       60    .995    .00131     .00130
1231      Okla.      --       0.022    0.741      59    --      0.00325    0.00184
1242      Okla.      --       .207     --         55    --      .00453     .00187
1367      Okla.      --       .324     --         50    .982    .00497     .00488
1677      S. Dak.    --       0.733    --         51    0.939   0.00441    0.00414
1690      S. Dak.    --       .333     --         60    .505    --         --
1803      S. Dak.    .000     .500     .011       59    .503    --         .00009
(b)1805   S. Dak.    .000     .538     .158       91    .580    .00146     .00085
1056      Tex.       0.367    0.273    0.226      60    0.908   0.00292    0.00265
1059      Tex.       .226     .074     .445       57    .513    .00433     .00222
1060      Tex.       .000     .250     .231       59    .302    .00301     .00091

(a) Dot count estimate of GT was used.
(b) This segment is a mixed wheat site.

-- = entry not legible in the source copy.



TABLE 25.- AI-LABELED MACHINE CLASSIFICATION 

Segment   State     "10     "01     p          n    R       p(1-p)/n   R p(1-p)/n

1005      Colo.     0.081   0.696   0.347      60   0.915   0.00378    0.00346
          Kans.     0.189   0.227   0.386      59   0.667   0.00402    0.00268
          Kans.      .038   1.00     .095      58    .996    .00148     .00148
          Kans.      .217    .214    .303      60    .712    .00352     .00251
1512      Minn.     0.158   0.429   0.337      59   0.818   0.00379    0.00310
1544      Mont.     0.156   0.448   0.383      99   0.826   0.00239    0.00197
1582      Neb.      0.000   0.250   0.194      60   0.293   0.00261    0.00076
1604      N. Dak.           0.581   0.524      60   0.903   0.00416    0.00375
1606      N. Dak.    .091    .440    .329      47    .738    .00470     .00347
1648      N. Dak.    .194    .667    .379      60    .976    .00392     .00383
1661      N. Dak.    .194    .409    .410      53    .834    .00456     .00381
1902      N. Dak.    .000   1.00     .086      60   (a)
          Okla.     0.364   0.021   0.741      59   0.509   0.00325    0.00166
          Okla.      .080    .233              55    .512    .00453
          Okla.      .095    .172              50    .466    .00497
1677      S. Dak.           0.533              51   0.634   0.00441    0.00279
1690      S. Dak.            .545              60    .683               .00191
1803      S. Dak.            .500    .011      59    .503               .000093
1805(b)   S. Dak.                    .158      91    .844    .00146     .00123
1056      Tex.      0.068   0.500   0.226      60           0.00292    0.00223
1059      Tex.       .259    .200    .445      57            .00433     .00308
1060      Tex.       .024    .529    .231      59            .00301     .00205

(a) This represents an extreme case for which the R-value does not exist. 
(b) This segment is a mixed wheat site. 


Plot of R-value (vertical axis, .20 to 1.00) versus GT proportion estimate (p) 
(horizontal axis, .00 to .70). The legend distinguishes spring wheat, winter 
wheat, and mixed wheat segments. 

Figure 2.- AI-labeled machine classification. 



5. CONCLUSIONS 


A comparison of PI proportion estimation results from the IBM study and the 
secondary error analysis study showed no significant differences between the 
two studies. 

Re-evaluation of the secondary error analysis data indicated significant 
differences in the probabilities of correctly classifying small grains using 
type 1 dots (PCG1). The PCG1 for the AI-labeled random dot grid was 
significantly lower than that for both of the GT-labeled dot grids (i.e., the 
random and systematic dot grids). However, the two GT-labeled dot grids were 
not significantly different from each other. The PCG1 means were as follows: 

1. GT-labeled random dot grid - 80.6 percent 

2. GT-labeled uniform dot grid - 80.2 percent 

3. AI-labeled random dot grid - 60.0 percent 

The superior performance of the GT-labeled random grid over the GT-labeled 
uniform grid can probably be attributed to differences in the purity of the 
type 1 dots used on the two grids. The analyst selected the type 1 dots used 
on the GT-labeled random grid with the aid of the Landsat imagery and GT 
information. The type 1 dots on the GT-labeled uniform grid were selected by 
inspection of GT images but without the aid of Landsat imagery to verify the 
purity of the type 1 dots. It is speculated that some boundary dots were 
inadvertently included in the type 1 GT-labeled uniform grid dots. 

The analysis of the probabilities of correctly classifying small grains using 
type 2 dots (PCG2) showed that the GT-labeled random dot grid provided 
significantly better performance than both the GT-labeled uniform grid and the 
AI-labeled random grid. No significant difference was noted between the PCG2's 
for the GT-labeled uniform grid and the AI-labeled random grid. The PCG2 means 
were as follows: 

1. GT-labeled random dot grid - 67.9 percent 

2. GT-labeled uniform dot grid - 58.4 percent 

3. AI-labeled random dot grid - 53.0 percent 




The analyses of the signed differences between PI small-grain proportion 
estimates and GT proportions were performed separately on segments from 
winter wheat areas and spring wheat areas. In the winter wheat area, the 
proportion estimates obtained from PI were not significantly different from 
GT proportions. This was true for both the GT-labeled grids and the 
AI-labeled grid. 

In the spring wheat area, the analyses had to be performed on each state 
separately because of interaction of PI proportion estimates with states. 
The results indicated that for Minnesota and Montana, PI proportion estimates 
obtained using the GT-labeled uniform grid and the AI-labeled random grid 
were significantly better than those from the GT-labeled random grid. However, 
PI proportion estimates from the GT-labeled uniform grid and the AI-labeled 
grid were not significantly different from each other. For North Dakota and 
South Dakota, PI proportion estimates from the GT-labeled random grid and the 
GT-labeled uniform grid were both significantly better than those from the 
AI-labeled random grid. However, the PI proportion estimates from the two 
GT-labeled grids were not significantly different from each other. 

The efficiency of PI in reducing the variance of the proportion estimate 
obtained from bias correction using type 2 dots was computed. The mean 
reduction coefficients (R) for the GT-labeled random grid and the AI-labeled 
random grid are as follows: 

1. GT-labeled random grid - 0.718 

2. AI-labeled random grid - 0.714 

The standard deviations on these estimates are 0.217 and 0.182, respectively. 

Clearly, PI does not provide much gain over a simple random sample proportion 
estimate from the type 2 dots. 
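
One way to read these mean R-values is as an equivalent sample size. Because the tabulated variance with PI is R p(1 - p)/n, the same variance would be obtained from a simple random sample of n/R dots. The Python sketch below works this out for the two mean R-values; the function names are hypothetical, and n = 60 is used only because it is a typical dot count in tables 24 and 25.

```python
def equivalent_sample_size(n, r):
    """Simple random sample size giving the same variance as n dots with coefficient R."""
    return n / r


def variance_reduction_percent(r):
    """Percent reduction in variance relative to a simple random sample."""
    return 100.0 * (1.0 - r)


mean_r = {
    "GT-labeled random grid": 0.718,
    "AI-labeled random grid": 0.714,
}

n = 60  # illustrative dot count, typical of the tabulated segments
for label, r in mean_r.items():
    n_eq = equivalent_sample_size(n, r)
    cut = variance_reduction_percent(r)
    print(f"{label}: about {n_eq:.0f} equivalent dots, {cut:.1f} percent variance reduction")
```

Under this reading, 60 type 2 dots used with PI are worth roughly 84 dots in a simple random sample, a variance reduction of a little under 30 percent.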





6. REFERENCES 


1. Havens, K. A.: Secondary Error Analysis: The Evaluation of Analyst Dot 
   Labeling. LEC-12380, Sept. 1978. 

2. Umberger, D. E.; Proctor, M. H.; Clark, J. E.; Eisgruber, L. M.; and 
   Braschler, C. B.: Econometric Models for Predicting Confusion Crop 
   Ratios. To be published in the Proceedings of the LACIE Symposium. 

3. Kirk, Roger E.: Experimental Design: Procedures for the Behavioral 
   Sciences. Brooks/Cole Publishing Company, 1978, pp. 91-93. 





